Abstract. Correlations are known to play a crucial role in determining the structure
of complex networks. Here we study how their presence affects the computation of the
percolation threshold in random hypergraphs. In order to mimic the correlations found in
real networks, we build hypergraphs from a generalized hidden-variable ensemble and we
study the percolation transition by mapping this problem to the fully connected Potts
model with heterogeneous couplings.
1. Introduction

In the last decade the topological properties of networks have attracted considerable interest
[1, 2, 3, 4], mainly driven by the emergence of novel dynamical effects in random
processes defined on them [5, 6]. The impact of this research is wide and has important
consequences in the domains of biology, socio-economic theory and technological
infrastructure design. It has been shown that complex networks can display a universal
behavior which strongly affects the dynamics of statistical models built on them.

A major example is provided by the percolation transition [7, 8, 9, 10], one of the
most famous emergent collective phenomena that can be defined on complex networks.
Its dependence on the degree distribution, degree correlations and directionality of the
links has been extensively studied in recent years [11, 12, 13]. In particular, the
interplay between topological features and the nature of the percolation transition
has been investigated within different types of random network ensembles
[14, 15, 16, 17, 18, 19, 20]. These constitute null models for networks, each of them
being formed by graphs that share with real complex networks a number of structural
features, such as the degree distribution or correlations between neighboring components.

Recently, attention has been devoted to the structure of hypergraphs [21, 22],
which describe, for example, many on-line social and professional communities that
collaborate in order to give a semantic structure to a set of data. Among these
communities, also named folksonomies, we mention Flickr and CiteULike, whose structure
is formed by triplets of users, resources and tags linked together. Networks of this kind
show important correlations, since the interest of the user and the subject of the resource
(for example a picture for Flickr and an article for CiteULike) usually have a strong
inter-dependence.

These correlations are responsible for the build-up of the communities that appear after
projecting these networks onto networks made only of user-user, resource-resource and
tag-tag interactions [22].

In a recent paper [21], the percolation transition in random uncorrelated
hypergraphs was characterized, providing a first approximation to the properties of real
hypergraphs. In this work we extend these results to more general ensembles of correlated
hypergraphs, which can give a better description of real social communities. We will
show how to build null models for hypergraphs based on a parallel construction recently
introduced for networks [19, 20]. Moreover, we will derive the percolation threshold of
correlated hypergraphs by mapping the problem to the solution of a fully connected
Potts model with heterogeneous couplings [23], following the method developed in a
recent paper on the percolation phase transition in simple networks [24].

2. Correlated random hypergraphs 

To mimic the correlations of real hypergraphs we propose to study randomized ensembles
of correlated hypergraphs within the same theoretical framework developed in [19, 20] for
networks. In these works it has been shown how it is possible to correctly define statistical
microcanonical and canonical network ensembles. In particular they show how to obtain
the general relation between the G(N, L) random network ensemble, a collection of graphs
with a fixed number of nodes N and links L, and the G(N, p) ensemble, formed by N nodes
in which each pair is linked with probability p. In the latter case the number of links L
fluctuates and, for sparse graphs, has a Poissonian distribution with mean pN(N-1)/2. In
the thermodynamic limit it is known that these two ensembles share the same statistical
properties when the external parameters L and p satisfy the relation p = 2L/[N(N-1)].
We call G(N, L) the microcanonical ensemble because it satisfies a hard constraint on the
extensive number of links L, like the energy constraint of the usual microcanonical ensemble.
The G(N, p) ensemble, on the other hand, is canonical, in the sense that the Lagrange
multiplier p lets the number of links fluctuate, fixing only its average value to L = pN(N-1)/2.
Generalizing this approach to several properties, such as the degree sequence or a certain
division into different communities, it is possible to define complex microcanonical and
canonical network ensembles satisfying more stringent constraints [19, 20]. Here we
sketch how this general framework can be generalized to hypergraphs.
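The difference between the two ensembles can be illustrated numerically. The following is a minimal sketch, assuming that p is the linking probability of each pair of nodes, so that the average number of links is pN(N-1)/2; the function names `sample_gnl` and `sample_gnp` and the toy sizes are illustrative choices, not taken from [19, 20].

```python
import itertools
import random

def sample_gnl(n, num_links, rng):
    """Microcanonical G(N, L): exactly num_links distinct links among n nodes."""
    all_pairs = list(itertools.combinations(range(n), 2))
    return rng.sample(all_pairs, num_links)

def sample_gnp(n, p, rng):
    """Canonical G(N, p): each pair of nodes is linked independently with probability p."""
    return [pair for pair in itertools.combinations(range(n), 2) if rng.random() < p]

rng = random.Random(0)
n, num_links = 60, 90
p = 2 * num_links / (n * (n - 1))  # chosen so that the mean number of links equals L

micro = sample_gnl(n, num_links, rng)
canonical_sizes = [len(sample_gnp(n, p, rng)) for _ in range(500)]
mean_links = sum(canonical_sizes) / len(canonical_sizes)

# The microcanonical sample has exactly L links; the canonical one only on average.
print(len(micro), round(mean_links, 1), min(canonical_sizes), max(canonical_sizes))
```

In the microcanonical sample the number of links is fixed by construction, while in the canonical samples it fluctuates around the same value.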

Let us consider a hypergraph formed by nodes of different types $\alpha = 1, \ldots, K$
linked in groups of K, one node per type. For example in Flickr we have K = 3 and
$\alpha = 1, 2, 3$ indicating respectively agents, pictures and tags. We assume also, in order
to gain in generality, that each node $i_\alpha$ can be associated to a feature
$r_{i_\alpha} = 1, \ldots, R^\alpha$ indicating a given classification of the nodes. Again, in the
case of Flickr, the agents can be classified according to their interests or age, the tags
according to their general meaning and the pictures according to the type of subject
represented.

A given random correlated ensemble of hypergraphs can be defined as the set of all 
the hypergraphs which satisfy a number of constraints. 

In particular, we choose these constraints to be the number of hyperlinks
each node has (its hyperdegree) and the number of hyperlinks bridging sets of nodes with
given features. Following these prescriptions we can construct microcanonical and canonical
hypergraph ensembles.

2.1. Microcanonical hypergraph ensembles 

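Let us define a hypergraph by the tensor $a_{i_1,i_2,\ldots,i_K} = 1$ if the nodes $(i_1, i_2, \ldots, i_K)$ are
linked together and $a_{i_1,i_2,\ldots,i_K} = 0$ otherwise. Let us call $N_\alpha$ the number of nodes of type $\alpha$.
The hypergraphs in the microcanonical ensemble will then satisfy the following
conditions

$$ k_{i_\alpha} = \sum_{\{i_\gamma\}_{\gamma \neq \alpha}} a_{i_1,\ldots,i_\alpha,\ldots,i_K}, \qquad A(x_1, x_2, \ldots, x_K) = \sum_{\{i_\gamma\}} a_{i_1,\ldots,i_\alpha,\ldots,i_K} \prod_{\gamma=1}^{K} \delta(r_{i_\gamma} - x_\gamma), \tag{1} $$

with $k_{i_\alpha}$ being the hyperdegree of node $i_\alpha$ and with $A(x_1, x_2, \ldots, x_K)$ being the number
of hyperedges between nodes of features $(x_1, x_2, \ldots, x_K)$.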
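The two constrained quantities can be computed directly from a list of hyperedges. The snippet below is a minimal sketch under assumed data structures (a list of K-tuples for the hyperedges and one feature list per node type); the name `hypergraph_constraints` and the toy data are hypothetical.

```python
from collections import Counter

def hypergraph_constraints(hyperedges, features):
    """Return the hyperdegrees and the feature counts A(x_1, ..., x_K).

    hyperedges: list of K-tuples (i_1, ..., i_K); entry alpha indexes a node of type alpha.
    features:   features[alpha][i] is the feature r of node i of type alpha.
    """
    K = len(features)
    degrees = [Counter() for _ in range(K)]
    A = Counter()
    for edge in hyperedges:
        for alpha, node in enumerate(edge):
            degrees[alpha][node] += 1          # hyperdegree k of node i_alpha
        x = tuple(features[alpha][node] for alpha, node in enumerate(edge))
        A[x] += 1                              # hyperedges joining features (x_1, ..., x_K)
    return degrees, A

# Toy folksonomy with K = 3 (users, resources, tags), two nodes per type.
edges = [(0, 0, 0), (0, 1, 1), (1, 1, 0)]
features = [[0, 1], [0, 0], [1, 0]]
degrees, A = hypergraph_constraints(edges, features)
print(dict(degrees[0]), dict(A))
```

A microcanonical ensemble in the sense used here would then be the set of all hypergraphs for which `degrees` and `A` take prescribed values.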
2.2. Canonical hypergraph ensembles

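By using the statistical mechanics approach described in [19, 20] it can be shown that
hypergraphs in the canonical ensemble, satisfying on average the constraints
(1), can be constructed by assigning a hyperlink $a_{i_1,i_2,\ldots,i_K} = 1$ with probability

$$ p_{i_1,i_2,\ldots,i_K} = \frac{\theta_{i_1}\theta_{i_2}\cdots\theta_{i_K}\, W(r_{i_1}, r_{i_2}, \ldots, r_{i_K})}{1 + \theta_{i_1}\theta_{i_2}\cdots\theta_{i_K}\, W(r_{i_1}, r_{i_2}, \ldots, r_{i_K})}. \tag{2} $$

If the constants $\theta_{i_\alpha}$ and the tensor $W(r_1, r_2, \ldots, r_K)$ satisfy the following conditions

$$ \overline{k}_{i_\alpha} = \sum_{\{i_\gamma\}_{\gamma \neq \alpha}} p_{i_1,i_2,\ldots,i_K}, \qquad \overline{A}(x_1, x_2, \ldots, x_K) = \sum_{\{i_\gamma\}} p_{i_1,i_2,\ldots,i_K} \prod_{\gamma=1}^{K} \delta(r_{i_\gamma} - x_\gamma), \tag{3} $$

then the hyperdegrees $k_{i_\alpha}$ and the numbers of hyperedges $A(x_1, x_2, \ldots, x_K)$ defined in (1)
are Poisson distributed with averages $\overline{k}_{i_\alpha}$ and $\overline{A}(x_1, x_2, \ldots, x_K)$ given by (3).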
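A canonical hypergraph can be sampled by drawing every K-tuple independently. The following is a minimal sketch, assuming the linking probability has the form x/(1 + x) with x equal to the product of the hidden variables times W; the values of `theta`, `features` and the function `W` are hypothetical toy inputs, not fitted to any data.

```python
import itertools
import random

def link_probability(thetas, w):
    """Probability of one K-tuple: x / (1 + x) with x = theta_1 ... theta_K * W."""
    x = w
    for t in thetas:
        x *= t
    return x / (1.0 + x)

def all_probabilities(theta, features, W):
    """Map every K-tuple (one node per type) to its linking probability."""
    probs = {}
    for nodes in itertools.product(*(range(len(t)) for t in theta)):
        thetas = [theta[a][i] for a, i in enumerate(nodes)]
        r = tuple(features[a][i] for a, i in enumerate(nodes))
        probs[nodes] = link_probability(thetas, W(r))
    return probs

def sample_hypergraph(probs, rng):
    """Draw each hyperedge independently with its own probability."""
    return [nodes for nodes, p in probs.items() if rng.random() < p]

# Hypothetical toy input: K = 3, three nodes per type, two feature classes.
theta = [[0.5, 1.0, 0.8], [0.6, 0.9, 0.4], [1.2, 0.3, 0.7]]
features = [[0, 0, 1], [0, 1, 1], [0, 1, 0]]
W = lambda r: 2.0 if len(set(r)) == 1 else 0.5  # favours tuples with equal features

probs = all_probabilities(theta, features, W)
rng = random.Random(1)
samples = [sample_hypergraph(probs, rng) for _ in range(2000)]
mean_edges = sum(len(s) for s in samples) / len(samples)
expected_edges = sum(probs.values())
print(round(mean_edges, 2), round(expected_edges, 2))
```

The empirical mean number of hyperedges should approach the sum of the linking probabilities, which is the sense in which the constraints are satisfied only on average.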

In the limit of hypergraphs with a linking probability independent of the features
$r_{i_\alpha}$ of the nodes, the probability (2) becomes

$$ p_{i_1,i_2,\ldots,i_K} = \frac{\theta_{i_1}\theta_{i_2}\cdots\theta_{i_K}}{1 + \theta_{i_1}\theta_{i_2}\cdots\theta_{i_K}}. \tag{4} $$

Moreover, we recover the configuration model for uncorrelated hypergraphs by taking
the limit $\prod_\alpha \theta_{i_\alpha} \ll 1 \;\; \forall \{i_\alpha\}$. In this case the hyperedge probability becomes

$$ p_{i_1,i_2,\ldots,i_K} = \theta_{i_1}\theta_{i_2}\cdots\theta_{i_K} = \frac{k_{i_1} k_{i_2} \cdots k_{i_K}}{(\langle k \rangle N)^{K-1}}, \tag{5} $$

and this last expression describes uncorrelated hypergraphs, whose properties have been
studied in [21].
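The uncorrelated limit can be checked with a few lines of arithmetic. The sketch below assumes that each hidden variable is the hyperdegree divided by $(\langle k \rangle N)^{(K-1)/K}$ and that $\langle k \rangle N$ is the total number of hyperedge ends, equal for every node type; the function names and degree sequences are illustrative.

```python
def uncorrelated_thetas(degree_sequences):
    """theta_i = k_i / (<k> N)^((K-1)/K), with <k> N the total degree of each type."""
    K = len(degree_sequences)
    totals = [sum(seq) for seq in degree_sequences]
    if len(set(totals)) != 1:
        raise ValueError("every node type must carry the same total hyperdegree")
    scale = totals[0] ** ((K - 1) / K)
    return [[k / scale for k in seq] for seq in degree_sequences]

def expected_hyperdegree(theta, alpha, i):
    """Sum of p = theta_1 ... theta_K over all K-tuples containing node i of type alpha."""
    value = theta[alpha][i]
    for gamma, thetas in enumerate(theta):
        if gamma != alpha:
            value *= sum(thetas)
    return value

# Toy degree sequences for K = 3; the identity is algebraic, although the product
# form of p is only a valid probability when the theta products are much smaller than 1.
degree_sequences = [[3, 1, 2], [2, 2, 2], [1, 5]]
theta = uncorrelated_thetas(degree_sequences)
recovered = [[expected_hyperdegree(theta, a, i) for i in range(len(seq))]
             for a, seq in enumerate(degree_sequences)]
print(recovered)
```

With this choice of hidden variables the expected hyperdegree of each node reproduces its prescribed degree, as expected for a configuration model.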



3. Potts Model and percolation transition 

It is a well known result [25, 26, 27] of statistical mechanics that the Potts phase
transition in a fully connected system, in the limit of a number of colors $q \to 1$, describes the bond
percolation transition [23, 28] in Erdős-Rényi G(N, p) random graphs [29]. Recently
a fully connected Potts model with heterogeneous couplings [23, 24] was introduced in
order to study the percolation transition in random networks with heterogeneous degree
distribution and additional structural properties. Here we show that such a method can
be extended to correlated canonical hypergraphs.

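Instead of a pure Potts model we consider the following generalized Hamiltonian

$$ H = -\sum_{i_1, i_2, \ldots, i_K} J_{i_1 i_2 \ldots i_K}\, \delta_{s_{i_1}, s_{i_2}, \ldots, s_{i_K}}, \tag{6} $$

where the summation runs over all the K-sets that compose the fully connected
hypergraph, the Kronecker delta is equal to 1 only when all K spins coincide, and the
variables $\{s_{i_\alpha}\}$ are Potts spins taking q values in $\{1, 2, \ldots, q\}$. We
define for later convenience the vector $\mathbf{s}_\alpha = (s_{1_\alpha}, \ldots, s_{N_\alpha})$, where $N_\alpha$ is the number of sites
of a given type $\alpha$. Introducing an external parameter $\beta$ as the inverse temperature,
the partition function of the Hamiltonian model reads

$$ Z = \sum_{\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_K} e^{-\beta H[\{s\}]}, \tag{7} $$

whose cluster expansion gives the following expression

$$ Z = \sum_{\mathcal{H}} \; \prod_{(i_1, i_2, \ldots, i_K) \in E(\mathcal{H})} v_{i_1 i_2 \ldots i_K} \; q^{C(\mathcal{H})}. \tag{8} $$

Here the sum runs over the sub-hypergraphs $\mathcal{H}$ of the fully connected hypergraph,
$E(\mathcal{H})$ is the set of hyperedges of $\mathcal{H}$, the value $C(\mathcal{H})$ is the number of connected
components of the hypergraph $\mathcal{H}$ and

$$ v_{i_1 i_2 \ldots i_K} = e^{\beta J_{i_1 i_2 \ldots i_K}} - 1 \simeq \beta J_{i_1 i_2 \ldots i_K}, $$

where the last expression is valid in the small $\beta$ limit. After identifying the coupling

$$ \beta J_{i_1, i_2, \ldots, i_K} \simeq \frac{p_{i_1 i_2 \ldots i_K}}{1 - p_{i_1 i_2 \ldots i_K}} = \theta_{i_1}\theta_{i_2}\cdots\theta_{i_K}\, W(r_{i_1}, r_{i_2}, \ldots, r_{i_K}), \tag{9} $$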

then the sum in (8) is a weighted sum over random hypergraphs $\mathcal{H}$ in which each hyperlink has
probability $p_{i_1, i_2, \ldots, i_K}$ given in equation (2). If the condition (9) holds, the transition
of the Hamiltonian model (6) for $q \to 1$ coincides with the percolation transition in the
random correlated hypergraph ensemble. In the next section we will solve the Potts
model, providing the percolation condition for hypergraphs in the ensemble (2), which
can be written in the form

$$ \det(S) = 0, \tag{10} $$

with the matrix S to be determined in the following.
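The cluster expansion of the Potts partition function can be verified by brute force on a very small system. The following is a minimal sketch, assuming the weights $v = e^{\beta J} - 1$ per hyperedge and counting isolated nodes as components; the system size, the couplings and all function names are hypothetical choices made only to keep the enumeration small.

```python
import itertools
import math

def count_components(num_nodes, hyperedges):
    """Connected components (isolated nodes included), computed with union-find."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for edge in hyperedges:
        root = find(edge[0])
        for node in edge[1:]:
            parent[find(node)] = root
    return len({find(x) for x in range(num_nodes)})

def z_spin_sum(num_nodes, edges, J, beta, q):
    """Direct sum over all spin configurations of exp(-beta H)."""
    total = 0.0
    for spins in itertools.product(range(q), repeat=num_nodes):
        # A hyperedge contributes -J only when all of its spins coincide.
        energy = -sum(Je for e, Je in zip(edges, J) if len({spins[n] for n in e}) == 1)
        total += math.exp(-beta * energy)
    return total

def z_cluster_expansion(num_nodes, edges, J, beta, q):
    """Sum over sub-hypergraphs of prod(v) * q^(number of components)."""
    v = [math.exp(beta * Je) - 1.0 for Je in J]
    total = 0.0
    for mask in range(1 << len(edges)):
        chosen = [b for b in range(len(edges)) if (mask >> b) & 1]
        weight = math.prod(v[b] for b in chosen)
        total += weight * q ** count_components(num_nodes, [edges[b] for b in chosen])
    return total

# K = 3 node types with two nodes each; node i of type alpha has global id 2 * alpha + i.
edges = [(i, 2 + j, 4 + k) for i, j, k in itertools.product(range(2), repeat=3)]
J = [0.3, 0.7, 0.1, 0.5, 0.9, 0.2, 0.4, 0.6]
z_direct = z_spin_sum(6, edges, J, beta=0.8, q=3)
z_clusters = z_cluster_expansion(6, edges, J, beta=0.8, q=3)
print(z_direct, z_clusters)
```

The two numbers agree up to floating-point error, which illustrates why sub-hypergraphs weighted by the factors v play the role of random hypergraph realizations.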

3.1. Solving Potts model 

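In this subsection we solve the Potts model (6), in the generic case, for arbitrary K and
$J_{i_1, i_2, \ldots, i_K}$ given by (9). We introduce the order parameter $c^{\alpha}_{\theta, r}(s)$ indicating the fraction
of nodes of type $\alpha = 1, \ldots, K$ associated to the 'hidden variable' $\theta$ and the feature $r$,
having Potts spin equal to $s$,

$$ c^{\alpha}_{\theta, r}(s) = \frac{1}{N_\alpha\, p^{\alpha}(\theta, r)} \sum_{i_\alpha} \delta(\theta_{i_\alpha}, \theta)\, \delta_{r_{i_\alpha}, r}\, \delta_{s_{i_\alpha}, s}, \tag{11} $$

where $p^{\alpha}(\theta, r)$ is the probability distribution to have a node of type $\alpha$ with local
properties described by the variables $\theta$ and $r$. We have introduced for convenience the
Dirac delta function, with its proper normalization $\int d\theta\, \delta(\theta) = 1$, and the Kronecker
delta, with $\sum_i \delta_{i,j} = 1$, defined as

$$ \delta(\theta, \theta') = \begin{cases} \infty & \theta = \theta' \\ 0 & \text{otherwise} \end{cases} \qquad \delta_{r, r'} = \begin{cases} 1 & r = r' \\ 0 & \text{otherwise.} \end{cases} \tag{12} $$

Noticing the symmetry of the Hamiltonian in terms of $c^{\alpha}_{\theta, r}(s)$,

$$ H[\{c\}] = -\prod_{\gamma} N_\gamma \sum_{s, \{r\}} \int \prod_{\gamma} \left[ d\theta^{\gamma}\, p^{\gamma}(\theta^{\gamma}, r^{\gamma})\, c^{\gamma}_{\theta^{\gamma}, r^{\gamma}}(s) \right] J(\{\theta\}, \{r\}), \tag{13} $$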
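we can express the partition function as a summation over these new variables

$$ Z = \sum_{\{c^{\alpha}_{\theta, r}(s)\}} e^{-\beta F[\{c\}]}, \tag{14} $$

where the free energy function is given by

$$ \beta F[\{c\}] = -\beta \prod_{\gamma} N_\gamma \sum_{s, \{r\}} \int \prod_{\gamma} \left[ d\theta^{\gamma}\, p^{\gamma}(\theta^{\gamma}, r^{\gamma})\, c^{\gamma}_{\theta^{\gamma}, r^{\gamma}}(s) \right] J(\{\theta\}, \{r\}) + \sum_{\alpha} \sum_{s, r^{\alpha}} N_\alpha \int d\theta^{\alpha}\, p^{\alpha}(\theta^{\alpha}, r^{\alpha})\, c^{\alpha}_{\theta^{\alpha}, r^{\alpha}}(s) \ln c^{\alpha}_{\theta^{\alpha}, r^{\alpha}}(s). \tag{15} $$

The phase transition of the Potts model is determined by the point at which the
free energy becomes unstable with respect to variations of the order parameters around the
symmetric solution

$$ c^{\alpha}_{\theta, r}(s) = \frac{1}{q}. \tag{16} $$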



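In order to determine the stability of the free energy we evaluate the Hessian $\mathcal{H}$ of
components

$$ \mathcal{H}_{\{a\}(s), \{a'\}(s')} = -\frac{\partial^2 \beta F}{\partial c_{\{a\}}(s)\, \partial c_{\{a'\}}(s')}, \tag{17} $$

where we indicate by $\{a\}$ ($\{a'\}$) the triplets $\{\alpha, \theta^{\alpha}, r^{\alpha}\}$ ($\{\alpha', \theta^{\alpha'}, r^{\alpha'}\}$), and the
sign is chosen so that the symmetric solution is stable when all the eigenvalues are negative.
Carrying out the calculation explicitly at the symmetric solution, the previous equation (17) reads

$$ \mathcal{H}_{\{a\}(s), \{a'\}(s')} = \delta_{s, s'} \left[ (1 - \delta_{\alpha, \alpha'}) \frac{\prod_{\gamma} N_\gamma}{q^{K-2}}\, p^{\alpha}(\theta^{\alpha}, r^{\alpha})\, p^{\alpha'}(\theta^{\alpha'}, r^{\alpha'}) \sum_{\{r^{\gamma}\}_{\gamma \neq \alpha, \alpha'}} \int \prod_{\gamma \neq \alpha, \alpha'} \left[ d\theta^{\gamma}\, p^{\gamma}(\theta^{\gamma}, r^{\gamma}) \right] \beta J(\{\theta\}, \{r\}) - N_\alpha\, p^{\alpha}(\theta^{\alpha}, r^{\alpha})\, q\, \delta_{\{a\}, \{a'\}} \right], \tag{18} $$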

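with the following definition: $\delta_{\{a\}, \{a'\}} = \delta_{\alpha, \alpha'}\, \delta(\theta^{\alpha}, \theta^{\alpha'})\, \delta_{r^{\alpha}, r^{\alpha'}}$. Taking $J(\{\theta\}, \{r\})$ from
equation (9), we obtain that the eigenvalue problem associated to the Hessian matrix is

$$ \left[ \Lambda + N_\alpha\, p^{\alpha}(\theta^{\alpha}, r^{\alpha})\, q \right] e(\{a\}) = \frac{\prod_{\gamma} N_\gamma}{q^{K-2}}\, p^{\alpha}(\theta^{\alpha}, r^{\alpha}) \sum_{\alpha' \neq \alpha} \sum_{\{r^{\gamma}\}_{\gamma \neq \alpha}} \int \prod_{\gamma \neq \alpha} \left[ d\theta^{\gamma}\, p^{\gamma}(\theta^{\gamma}, r^{\gamma}) \right] \beta J(\{\theta\}, \{r\})\, e(\{a'\}), \tag{19} $$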

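where $\Lambda$ and $e(\{a\})$ are respectively the eigenvalue and the eigenvector of the problem.
Equation (19) can be written as

$$ e(\{a\}) = \frac{N_\alpha\, p^{\alpha}(\theta^{\alpha}, r^{\alpha})}{\left( \Lambda + N_\alpha\, p^{\alpha}(\theta^{\alpha}, r^{\alpha})\, q \right) q^{K-2}}\, \Delta(\{a\}), \tag{20} $$

with the $\Delta(\{a\})$ defined as

$$ \Delta(\{a\}) = \sum_{\alpha' \neq \alpha} \sum_{\{r^{\gamma}\}_{\gamma \neq \alpha}} \prod_{\gamma \neq \alpha, \alpha'} \left[ N_\gamma \int d\theta^{\gamma}\, p^{\gamma}(\theta^{\gamma}, r^{\gamma}) \right] N_{\alpha'} \int d\theta^{\alpha'}\, p^{\alpha'}(\theta^{\alpha'}, r^{\alpha'})\, \beta J(\{\theta\}, \{r\})\, e(\{a'\}). \tag{21} $$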
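The symmetric solution becomes unstable when the maximal eigenvalue of the Hessian
problem becomes positive. Therefore, in order to determine the critical point of the
Potts model for $q \to 1$ we consider equations (20) and (21) when the eigenvalue vanishes,
$\Lambda = 0$. If we take into account the explicit form of the coupling constant given by Eq.
(9), we find that the vector $\Delta$ takes the form $\Delta(\{a\}) = \theta^{\alpha}\, v(\alpha, r^{\alpha})$, where the $v$'s satisfy
the linear system of equations

$$ S v = 0, \tag{22} $$

with the matrix $S_{\{\alpha, r^{\alpha}\}, \{\alpha', r^{\alpha'}\}}$ given by

$$ S_{\{\alpha, r^{\alpha}\}, \{\alpha', r^{\alpha'}\}} = -\delta_{\alpha, \alpha'}\, \delta_{r^{\alpha}, r^{\alpha'}} + (1 - \delta_{\alpha, \alpha'}) \sum_{\{r^{\gamma}\}_{\gamma \neq \alpha, \alpha'}} \prod_{\gamma \neq \alpha, \alpha'} \left[ \int d\theta^{\gamma}\, N_\gamma\, p^{\gamma}(\theta^{\gamma}, r^{\gamma})\, \theta^{\gamma} \right] \times \int d\theta^{\alpha'}\, N_{\alpha'}\, p^{\alpha'}(\theta^{\alpha'}, r^{\alpha'})\, (\theta^{\alpha'})^2\, W(r^1, \ldots, r^{\alpha}, \ldots, r^{\alpha'}, \ldots, r^K). \tag{23} $$

Therefore the condition determining the critical point of the Potts model for $q \to 1$
comes from the vanishing of the determinant, i.e. $\det S = 0$.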
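The passage from the eigenvalue problem to the linear system for the $v$'s can be spelled out in three steps. The following is a sketch of that algebra, assuming that (20) expresses $e(\{a\})$ as $N_\alpha p^\alpha / [(\Lambda + N_\alpha p^\alpha q)\, q^{K-2}]$ times $\Delta(\{a\})$ and that the coupling is $\beta J = \prod_\gamma \theta^\gamma\, W(\{r\})$; it is meant as a reading aid, not as the authors' own derivation.

```latex
\begin{align*}
% Step 1: set \Lambda = 0 in (20); the factor N_\alpha p^\alpha cancels.
e(\{a\}) &= \frac{\Delta(\{a\})}{q^{K-1}}
  \;\xrightarrow{\;q \to 1\;}\; \Delta(\{a\}), \\
% Step 2: insert e = \Delta and \beta J = \prod_\gamma \theta^\gamma W into (21).
\Delta(\{a\}) &= \theta^{\alpha} \sum_{\alpha' \neq \alpha}
  \sum_{\{r^\gamma\}_{\gamma \neq \alpha}}
  \prod_{\gamma \neq \alpha, \alpha'}
  \Big[ N_\gamma \int d\theta^\gamma\, p^\gamma(\theta^\gamma, r^\gamma)\, \theta^\gamma \Big]
  N_{\alpha'} \int d\theta^{\alpha'}\, p^{\alpha'}(\theta^{\alpha'}, r^{\alpha'})\,
  \theta^{\alpha'}\, W(\{r\})\, \Delta(\{a'\}), \\
% Step 3: the ansatz \Delta(\{a\}) = \theta^\alpha v(\alpha, r^\alpha) removes the
% prefactor \theta^\alpha and turns \theta^{\alpha'} into (\theta^{\alpha'})^2.
v(\alpha, r^{\alpha}) &= \sum_{\alpha' \neq \alpha} \sum_{r^{\alpha'}}
  \Big\{ \sum_{\{r^\gamma\}_{\gamma \neq \alpha, \alpha'}}
  \prod_{\gamma \neq \alpha, \alpha'}
  \Big[ N_\gamma \int d\theta^\gamma\, p^\gamma(\theta^\gamma, r^\gamma)\, \theta^\gamma \Big]
  N_{\alpha'} \int d\theta^{\alpha'}\, p^{\alpha'}(\theta^{\alpha'}, r^{\alpha'})\,
  (\theta^{\alpha'})^2\, W(\{r\}) \Big\}\, v(\alpha', r^{\alpha'}).
\end{align*}
```

Moving the left-hand side of the last line to the right gives a homogeneous linear system, whose matrix has $-1$ on the diagonal and the term in braces off the diagonal, so a non-trivial solution requires a vanishing determinant.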




3.2. Simplified cases 

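In the simplified case in which the linking probability does not depend on the communities
$\{r\}$, the coupling constant in the Potts model given by (9) becomes simply

$$ \beta J_{i_1, i_2, \ldots, i_K} \simeq \theta_{i_1}\theta_{i_2}\cdots\theta_{i_K}, \tag{24} $$

and the solution $v(\alpha, r^{\alpha})$ can be expressed as

$$ v(\alpha, r^{\alpha}) = u^{\alpha}. \tag{25} $$

Using the definition (23) we get a non-null solution for the $u$'s if and only if the condition
$\det \Phi = 0$ is satisfied, with $\Phi$ defined as

$$ \Phi_{\alpha, \alpha'} = -\delta_{\alpha, \alpha'} + (1 - \delta_{\alpha, \alpha'})\, N_{\alpha'} \langle \theta^2 \rangle_{\alpha'} \prod_{\gamma \neq \alpha, \alpha'} N_\gamma \langle \theta \rangle_\gamma, \tag{26} $$

where $\langle \theta^n \rangle_\gamma = \int d\theta\, p^{\gamma}(\theta)\, \theta^n$. In the case K = 3 this condition reduces to the following relation

$$ 2 \pi_{12} \pi_{23} \pi_{31} + \pi_{12} \pi_{32} + \pi_{13} \pi_{23} + \pi_{21} \pi_{31} - 1 = 0, \tag{27} $$

with the proper identification

$$ \pi_{\alpha, \alpha'} = N_\alpha \langle \theta^2 \rangle_\alpha\, N_{\alpha'} \langle \theta \rangle_{\alpha'}. \tag{28} $$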
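The K = 3 condition can be cross-checked numerically against the determinant of a 3 x 3 matrix. The sketch below rests on two assumptions that are readings of the equations rather than statements confirmed by the source: that the matrix has $-1$ on the diagonal and off-diagonal entries $N_{\alpha'}\langle\theta^2\rangle_{\alpha'} N_\gamma \langle\theta\rangle_\gamma$ (with $\gamma$ the third type), and that $\pi_{\alpha,\alpha'} = N_\alpha \langle\theta^2\rangle_\alpha N_{\alpha'} \langle\theta\rangle_{\alpha'}$. The function names and the moment values are hypothetical.

```python
import numpy as np

def pi_matrix(N, theta_mean, theta_sq_mean):
    """pi[a, b] = N_a <theta^2>_a * N_b <theta>_b (assumed identification)."""
    N = np.asarray(N, dtype=float)
    return np.outer(N * np.asarray(theta_sq_mean), N * np.asarray(theta_mean))

def phi_matrix(N, theta_mean, theta_sq_mean):
    """Phi[a, b] = -1 on the diagonal, otherwise N_b <theta^2>_b * prod_{g != a, b} N_g <theta>_g."""
    N = np.asarray(N, dtype=float)
    first = N * np.asarray(theta_mean)
    second = N * np.asarray(theta_sq_mean)
    K = len(N)
    Phi = -np.eye(K)
    for a in range(K):
        for b in range(K):
            if a != b:
                others = [g for g in range(K) if g not in (a, b)]
                Phi[a, b] = np.prod(first[others]) * second[b]
    return Phi

def k3_condition(pi):
    """Left-hand side of the K = 3 relation, written with 1-based type indices."""
    p = lambda a, b: pi[a - 1, b - 1]
    return (2 * p(1, 2) * p(2, 3) * p(3, 1)
            + p(1, 2) * p(3, 2) + p(1, 3) * p(2, 3) + p(2, 1) * p(3, 1) - 1)

# Hypothetical moments of the hidden variables for the three node types.
N = [50, 40, 30]
theta_mean = [0.10, 0.12, 0.15]
theta_sq_mean = [0.02, 0.03, 0.04]

det_phi = np.linalg.det(phi_matrix(N, theta_mean, theta_sq_mean))
lhs = k3_condition(pi_matrix(N, theta_mean, theta_sq_mean))
print(det_phi, lhs)
```

Under these assumptions the determinant and the polynomial coincide for any choice of moments, so the percolation threshold is reached when either quantity crosses zero.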

The formula (27) is valid in the general case of hypergraphs that show non-trivial
correlations between nodes. We recover as a special case the percolation condition in
uncorrelated hypergraphs, just remembering the relation between the variables $\theta$ and
the mean site connectivity, $\theta_i = k_i / (\langle k \rangle N)^{2/3}$. Therefore, using the previous condition,
the $\pi$'s are given by

$$ \pi_{\alpha, \alpha'} = \ldots \tag{29} $$

and after some algebra we obtain the condition for the percolation already found in [21]

$$ \ldots \tag{30} $$



4. Conclusions 



Correlations account for the non-trivial structure of complex networks and must also play a
significant role in the characterization of hypergraphs describing folksonomies. In
this paper we have studied ensembles of correlated hypergraphs which can be used to
model the interactions between different types of nodes in real complex hypergraphs. We
determined the percolation threshold by mapping this problem to a fully connected Potts
model with heterogeneous couplings. Our approach extends the present knowledge on
percolation in uncorrelated hypergraphs. Future developments will link these findings to
the study and characterization of real folksonomies and to the analysis of the robustness
of the giant component phase against the removal of nodes or hyperedges.